Entropy and MDL discretization of continuous variables for Bayesian belief networks

Authors

  • Ellis J. Clarke
  • Bruce A. Barton
Abstract

An efficient algorithm for partitioning the range of a continuous variable into a discrete number of intervals, for use in the construction of Bayesian belief networks (BBNs), is presented here. The partitioning minimizes the information loss relative to the number of intervals used to represent the variable. Partitioning can be done prior to BBN construction or extended for repartitioning during construction. Prior partitioning allows either Bayesian or minimum descriptive length (MDL) metrics to be used to guide BBN construction. Dynamic repartitioning, during BBN construction, is done with an MDL metric to guide construction. The methods are demonstrated with data from two epidemiological studies, and the results are compared for all of the methods. The use of the partitioning algorithm resulted in more sparsely connected BBNs than binary partitioning did, with little information loss from mapping continuous variables into discrete ones. © 2000 John Wiley & Sons, Inc.
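The abstract does not spell out the algorithm, so the following is only a minimal illustrative sketch of entropy-guided partitioning, not the authors' method. It assumes a greedy bottom-up strategy: start from many equal-width bins and repeatedly merge the adjacent pair whose merge loses the least Shannon entropy, until `k` intervals remain. The names `entropy`, `greedy_merge`, and `fine_bins` are hypothetical.

```python
import math


def entropy(counts):
    """Shannon entropy in bits of a histogram given as a list of counts."""
    n = sum(counts)
    return -sum(c / n * math.log2(c / n) for c in counts if c > 0)


def greedy_merge(values, k, fine_bins=16):
    """Partition the range of `values` into k intervals (illustrative only).

    Starts from `fine_bins` equal-width bins and repeatedly merges the
    adjacent pair of bins whose merge causes the smallest entropy loss.
    Returns (edges, counts), where edges has len(counts) + 1 entries.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("values must span a non-empty range")
    if not 1 <= k <= fine_bins:
        raise ValueError("k must be between 1 and fine_bins")

    # Build the initial fine-grained equal-width histogram.
    width = (hi - lo) / fine_bins
    edges = [lo + i * width for i in range(fine_bins + 1)]
    counts = [0] * fine_bins
    for v in values:
        idx = min(int((v - lo) / width), fine_bins - 1)
        counts[idx] += 1

    # Merge adjacent bins until only k intervals remain.
    while len(counts) > k:
        base = entropy(counts)
        best_i, best_loss = 0, float("inf")
        for i in range(len(counts) - 1):
            merged = counts[:i] + [counts[i] + counts[i + 1]] + counts[i + 2:]
            loss = base - entropy(merged)  # information lost by this merge
            if loss < best_loss:
                best_i, best_loss = i, loss
        counts[best_i:best_i + 2] = [counts[best_i] + counts[best_i + 1]]
        del edges[best_i + 1]  # drop the boundary between the merged bins
    return edges, counts


if __name__ == "__main__":
    data = [0.0, 0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 5.3, 10.0, 10.1, 10.2, 10.3]
    edges, counts = greedy_merge(data, k=3, fine_bins=8)
    print(edges, counts, entropy(counts))
```

In this sketch the number of intervals `k` is fixed by the caller. An MDL-style criterion, as described in the abstract, would instead weigh the entropy lost against the cost of using more intervals; that trade-off is not implemented here.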


Similar resources

Learning Bayesian Belief Networks: An Approach Based on the MDL Principle

A new approach for learning Bayesian belief networks from raw data is presented. The approach is based on Rissanen's Minimal Description Length (MDL) principle, which is particularly well suited for this task. Our approach does not require any prior assumptions about the distribution being learned. In particular, our method can learn unrestricted multiply connected belief networks. Furthermore, unlike o...


Error-Based and Entropy-Based Discretization of Continuous Features

We present a comparison of error-based and entropy-based methods for discretization of continuous features. Our study includes both an extensive empirical comparison as well as an analysis of scenarios where error minimization may be an inappropriate discretization criterion. We present a discretization method based on the C4.5 decision tree algorithm and compare it to an existing entropy-based ...


Bayesian Belief Network Modeling and Diagnosis of Xerographic Systems

In this paper, a Bayesian Belief Network (BBN) approach to the modeling and diagnosis of xerographic printing systems is proposed. First, a continuous BBN model based on physics of the printing process and field data is developed. The model captures the causal relationships between the various physical variables in the s...


Improved Algorithms for Univariate Discretization of Continuous Features

In discretization of a continuous variable, its numerical value range is divided into a few intervals that are used in classification. For example, Naïve Bayes can benefit from this processing. A commonly used supervised discretization method is Fayyad and Irani's recursive entropy-based splitting of a value range. The technique uses MDL as a model selection criterion to decide whether to accept ...


An examination of the effect of discretization on a naïve Bayes model's performance

A Bayesian network (or a belief network) is a probabilistic graphical model that represents a set of variables and their probabilistic independencies. Research often involves continuous random variables. To apply these continuous variables to BN models, they must be converted into discrete variables with a limited number of states, often two. During the discretization process, one pr...



Journal:
  • Int. J. Intell. Syst.

Volume: 15

Publication date: 2000